feat(transfers): add permissions transfer functionality and update configuration #546

Merged: jirhiker merged 1 commit into staging from transfer-fix-permission-history on Feb 25, 2026

Conversation

@jirhiker (Member) commented Feb 25, 2026

Summary

This PR fixes permissions transfer ordering and adds first-class permissions coverage to transfer options and transfer validation reporting.

Problem

Permissions were silently missing because:

  1. Permissions transfer ran in parallel with Contacts, but permissions require contact associations to exist first.
  2. transfer-results did not include any permissions comparison spec, so missing PermissionHistory rows were not reported.
  3. There was no dedicated transfer toggle for permissions.

Changes

  1. Sequencing fix:

    • Removed permissions execution from parallel group 1.
    • Run permissions immediately after group 1 completes, so contacts/associations are committed first.
  2. New transfer option:

    • Added TransferOptions.transfer_permissions.
    • Wired env var TRANSFER_PERMISSIONS (default True).
    • Added TRANSFER_PERMISSIONS=True to .env.example.
  3. Transfer-results permissions validation:

    • Added PermissionsTransferResult type.
    • Added Permissions entry to TRANSFER_COMPARISON_SPECS with option_field="transfer_permissions".
    • Added dedicated builder logic for permissions that:
      • derives expected permission keys from WellData (SampleOK, MonitorOK) for wells that have contact associations,
      • derives destination keys from PermissionHistory,
      • compares composite keys: pointid|permission_type|permission_allowed,
      • honors disabled state when TRANSFER_PERMISSIONS=false.
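
The composite-key derivation described above can be sketched in plain Python (the row shape and the normalization behavior of `_normalize_key` are assumptions, not the actual builder code):

```python
def normalize_key(value):
    # Assumed normalization: trim whitespace and lowercase (stand-in for _normalize_key).
    return str(value).strip().lower()

def source_permission_keys(well_rows, eligible_pointids):
    """Derive pointid|permission_type|permission_allowed keys from WellData flags.

    well_rows: iterable of dicts with PointID, SampleOK, MonitorOK (hypothetical shape).
    eligible_pointids: wells that already have contact associations.
    """
    keys = []
    for row in well_rows:
        pointid = row.get("PointID")
        if pointid not in eligible_pointids:
            continue
        if row.get("SampleOK") is not None:
            keys.append(f"{normalize_key(pointid)}|Water Chemistry Sample|{bool(row['SampleOK'])}")
        if row.get("MonitorOK") is not None:
            keys.append(f"{normalize_key(pointid)}|Water Level Sample|{bool(row['MonitorOK'])}")
    return keys
```

The same keys are built from PermissionHistory on the destination side, so missing/extra rows fall out of a simple set difference.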

Why this works

Permissions transfer now executes only after its prerequisites are present.
Validation now explicitly checks permissions, so zero/missing permissions are surfaced in oco transfer-results output instead of being invisible.
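
The sequencing fix can be sketched roughly like this (task names and the executor setup are illustrative, not the actual transfer.py orchestration):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_group_then_permissions(group1_tasks, permissions_task):
    """Run group-1 transfers in parallel, then run permissions only after ALL of
    group 1 has completed, so contacts/associations are committed first."""
    results = {}
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(fn): name for name, fn in group1_tasks.items()}
        for fut in as_completed(futures):
            results[futures[fut]] = fut.result()
    # Permissions runs strictly after group 1 finishes.
    results["Permissions"] = permissions_task()
    return results
```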

Files changed

  • transfers/transfer.py
  • .env.example
  • transfers/transfer_results_types.py
  • transfers/transfer_results_specs.py
  • transfers/transfer_results_builder.py

Validation

  • python -m py_compile on updated transfer-results modules and transfer.py passed.
  • pytest could not be fully run in sandbox due to blocked DB connection to localhost:5432.

Copilot AI review requested due to automatic review settings February 25, 2026 18:15
@jirhiker jirhiker merged commit 563d5d1 into staging Feb 25, 2026
10 checks passed

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a51c38596


Comment on lines +218 to +223
.where(PermissionHistory.target_table == "thing")
.where(
    PermissionHistory.permission_type.in_(
        ("Water Chemistry Sample", "Water Level Sample")
    )
)

P2: Scope permission destination query to water wells

The permissions source side is explicitly limited to Thing.thing_type == "water well", but the destination query only filters by target_table and permission_type. If the database has PermissionHistory rows for non-well things using the same permission types, this audit will report false extras and inflate destination counts, so source and destination are no longer compared over the same population.

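A toy illustration of the population mismatch this comment describes (plain Python with made-up rows; the actual fix would add a filter like `Thing.thing_type == "water well"` to the destination query):

```python
def destination_keys(permission_rows, scope_to_water_wells):
    # permission_rows: (point_id, thing_type, permission_type, allowed) tuples (hypothetical).
    keys = set()
    for point_id, thing_type, permission_type, allowed in permission_rows:
        if scope_to_water_wells and thing_type != "water well":
            continue
        keys.add(f"{point_id}|{permission_type}|{allowed}")
    return keys

rows = [
    ("ab-1", "water well", "Water Chemistry Sample", True),
    ("spring-7", "spring", "Water Chemistry Sample", True),  # non-well row, same permission type
]
source_keys = {"ab-1|Water Chemistry Sample|True"}

# Unscoped: the spring row shows up as a false "extra" in the audit.
false_extras = destination_keys(rows, scope_to_water_wells=False) - source_keys
# Scoped: source and destination compare over the same population.
scoped_extras = destination_keys(rows, scope_to_water_wells=True) - source_keys
```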

sample_ok = getattr(row, "SampleOK", None)
if sample_ok is not None:
    rows.append(
        f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"

P2: Parse permission flags instead of casting with bool()

Casting CSV values with bool(sample_ok) treats any non-empty string as True, so values like "False" or "0" will be interpreted as allowed permissions when these columns are string-typed. That produces incorrect source keys and therefore incorrect missing/extra counts in transfer-results output for those datasets.

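A hedged sketch of the safer parse this comment suggests (the exact accepted spellings are an assumption):

```python
def parse_flag(value):
    """Parse a CSV permission flag into a bool, or None if blank/unrecognized.

    Unlike bool(value), this treats strings like "False" and "0" as False
    instead of truthy non-empty strings.
    """
    if value is None:
        return None
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    return None  # blank or unknown: skip the row rather than guess
```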

Copilot AI (Contributor) left a comment


Pull request overview

This PR adds support for auditing and orchestrating a “Permissions” transfer, including a new transfer toggle and updated transfer-results reporting.

Changes:

  • Adds a new Permissions entry to the transfer-results auditing specs/types and implements custom comparison logic for PermissionHistory.
  • Updates the transfer orchestration to run permissions after parallel group 1 (after contacts are available) and adds an env toggle TRANSFER_PERMISSIONS.
  • Updates docs and CI environment configuration.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Summary per file:

  • transfers/transfer_results_types.py: Registers Permissions as a transfer-results type so it appears in auditing output.
  • transfers/transfer_results_specs.py: Adds Permissions to TRANSFER_COMPARISON_SPECS and imports PermissionHistory + result type.
  • transfers/transfer_results_builder.py: Implements permissions-specific source/destination key building and result computation.
  • transfers/transfer.py: Adds transfer_permissions option/env toggle and runs permissions after group 1 completes.
  • transfers/README.md: Documents the transfer auditing CLI usage and output.
  • .github/workflows/tests.yml: Adds PYGEOAPI_POSTGRES_USER to CI env.
  • .env.example: Adds TRANSFER_PERMISSIONS toggle.

Comment on lines +636 to +643
try:
    result_name, result, elapsed = _execute_permissions_with_timing(
        "Permissions"
    )
    results_map[result_name] = result
    logger.info(f"Task {result_name} completed in {elapsed:.2f}s")
except Exception as e:
    logger.critical(f"Task Permissions failed: {e}")

Copilot AI Feb 25, 2026


The sequential permissions transfer failure log only includes str(e) and drops the stack trace, while other parallel-transfer error paths log traceback.format_exc(). Logging the full traceback here would make diagnosing transfer failures significantly easier.
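
The fix this reviewer suggests would look roughly like the following (a sketch, not the actual transfer.py code):

```python
import logging
import traceback

logger = logging.getLogger("transfers")

def run_permissions_safely(task):
    try:
        return task()
    except Exception as e:
        # Log the full stack trace, matching the parallel-transfer error paths.
        logger.critical(f"Task Permissions failed: {e}\n{traceback.format_exc()}")
        return None
```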

Comment on lines +182 to +201
eligible_pointids = {name for (name,) in eligible_rows if name}
if not eligible_pointids:
    return pd.Series([], dtype=object)

rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    if pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
        )

    monitor_ok = getattr(row, "MonitorOK", None)
    if monitor_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Level Sample|{bool(monitor_ok)}"

Copilot AI Feb 25, 2026


In _permissions_source_series, membership is checked against eligible_pointids using the raw PointID value from the CSV, but the produced keys are normalized/lowercased via _normalize_key. This can incorrectly drop eligible wells when casing/whitespace differs between CSV and DB values. Normalize both sides (e.g., build eligible_pointids as normalized keys and compare with _normalize_key(pointid)).

Suggested change, before:

eligible_pointids = {name for (name,) in eligible_rows if name}
if not eligible_pointids:
    return pd.Series([], dtype=object)

rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    if pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
        )

    monitor_ok = getattr(row, "MonitorOK", None)
    if monitor_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Level Sample|{bool(monitor_ok)}"

Suggested change, after:

eligible_pointids = {
    _normalize_key(name) for (name,) in eligible_rows if _normalize_key(name)
}
if not eligible_pointids:
    return pd.Series([], dtype=object)

rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    normalized_pointid = _normalize_key(pointid)
    if not normalized_pointid or normalized_pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{normalized_pointid}|Water Chemistry Sample|{bool(sample_ok)}"
        )

    monitor_ok = getattr(row, "MonitorOK", None)
    if monitor_ok is not None:
        rows.append(
            f"{normalized_pointid}|Water Level Sample|{bool(monitor_ok)}"

Comment on lines +186 to +197
rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    if pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
        )

Copilot AI Feb 25, 2026


_permissions_source_series appends permission rows for every matching WellData record, which can inflate counts and introduce duplicate keys if WellData contains duplicate PointID rows. The actual permissions transfer de-duplicates by well (visited) and effectively uses the first matching WellData value per well. Consider de-duping WellData by PointID (stable) before iterating, or aggregating per PointID to mirror transfer behavior.
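
One way to mirror the transfer's first-match-per-well behavior, sketched in plain Python (a stand-in for deduplicating the WellData frame, e.g. with pandas `drop_duplicates(subset="PointID", keep="first")`; the row shape is assumed):

```python
def first_row_per_pointid(rows):
    """Keep only the first WellData row for each PointID, preserving order,
    so the audit counts each well once like the actual transfer does."""
    seen = {}
    for row in rows:
        pid = row.get("PointID")
        if pid is not None and pid not in seen:
            seen[pid] = row
    return list(seen.values())
```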

Comment on lines +358 to +368
with session_ctx() as session:
    source_series = (
        _permissions_source_series(session)
        if enabled
        else pd.Series([], dtype=object)
    )
    source_keys = set(source_series.unique().tolist())
    source_keyed_row_count = int(source_series.shape[0])
    source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
    agreed_transfer_row_count = source_keyed_row_count


Copilot AI Feb 25, 2026


_build_permissions reads WellData into source_df to count rows, but _permissions_source_series re-reads WellData internally. This duplicates I/O and parsing work for large CSVs. Consider passing the already-loaded source_df (or a pre-cleaned version) into _permissions_source_series so the file is only read once.

Suggested change, before:

with session_ctx() as session:
    source_series = (
        _permissions_source_series(session)
        if enabled
        else pd.Series([], dtype=object)
    )
    source_keys = set(source_series.unique().tolist())
    source_keyed_row_count = int(source_series.shape[0])
    source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
    agreed_transfer_row_count = source_keyed_row_count

Suggested change, after:

if enabled:
    # Derive the source key series directly from the already-loaded CSV
    if spec.source_key_column in source_df.columns:
        source_series = source_df[spec.source_key_column].astype(str)
    else:
        source_series = pd.Series([], dtype=object)
else:
    source_series = pd.Series([], dtype=object)
source_keys = set(source_series.unique().tolist())
source_keyed_row_count = int(source_series.shape[0])
source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
agreed_transfer_row_count = source_keyed_row_count
with session_ctx() as session:

Comment on lines +370 to +381
destination_row_count = int(
    session.execute(
        select(func.count())
        .select_from(PermissionHistory)
        .where(PermissionHistory.target_table == "thing")
        .where(
            PermissionHistory.permission_type.in_(
                ("Water Chemistry Sample", "Water Level Sample")
            )
        )
    ).scalar_one()
)

Copilot AI Feb 25, 2026


destination_row_count for permissions is computed without the same constraints used to build destination_series (inner join to Thing, Thing.name non-null). This can make the reported destination row count inconsistent with the key series (e.g., orphaned PermissionHistory rows or unnamed Things). Align the count query with _permissions_destination_series filters (and any additional filters like end_date is NULL).

Suggested change, before:

destination_row_count = int(
    session.execute(
        select(func.count())
        .select_from(PermissionHistory)
        .where(PermissionHistory.target_table == "thing")
        .where(
            PermissionHistory.permission_type.in_(
                ("Water Chemistry Sample", "Water Level Sample")
            )
        )
    ).scalar_one()
)

Suggested change, after:

destination_row_count = int(destination_series.shape[0])

Comment on lines +210 to +223
sql = (
    select(
        Thing.name.label("point_id"),
        PermissionHistory.permission_type.label("permission_type"),
        PermissionHistory.permission_allowed.label("permission_allowed"),
    )
    .select_from(PermissionHistory)
    .join(Thing, Thing.id == PermissionHistory.target_id)
    .where(PermissionHistory.target_table == "thing")
    .where(
        PermissionHistory.permission_type.in_(
            ("Water Chemistry Sample", "Water Level Sample")
        )
    )

Copilot AI Feb 25, 2026


_permissions_destination_series pulls all PermissionHistory rows for the permission types, including historical/ended records. This can create false matches (a well may match a historical value even if the current active permission differs) and inflate extra counts over time. Consider restricting to the active/current records (e.g., end_date IS NULL) and, if multiples exist, selecting the latest start_date per (target_id, permission_type) to match how the API surfaces permissions.

Suggested change, before:

sql = (
    select(
        Thing.name.label("point_id"),
        PermissionHistory.permission_type.label("permission_type"),
        PermissionHistory.permission_allowed.label("permission_allowed"),
    )
    .select_from(PermissionHistory)
    .join(Thing, Thing.id == PermissionHistory.target_id)
    .where(PermissionHistory.target_table == "thing")
    .where(
        PermissionHistory.permission_type.in_(
            ("Water Chemistry Sample", "Water Level Sample")
        )
    )

Suggested change, after:

# Select only active permission records (end_date IS NULL) and, for each
# (target_id, permission_type), keep the record with the latest start_date.
latest_active_subq = (
    select(
        PermissionHistory.target_id.label("target_id"),
        PermissionHistory.permission_type.label("permission_type"),
        func.max(PermissionHistory.start_date).label("max_start_date"),
    )
    .where(PermissionHistory.target_table == "thing")
    .where(
        PermissionHistory.permission_type.in_(
            ("Water Chemistry Sample", "Water Level Sample")
        )
    )
    .where(PermissionHistory.end_date.is_(None))
    .group_by(
        PermissionHistory.target_id,
        PermissionHistory.permission_type,
    )
    .subquery()
)
sql = (
    select(
        Thing.name.label("point_id"),
        PermissionHistory.permission_type.label("permission_type"),
        PermissionHistory.permission_allowed.label("permission_allowed"),
    )
    .select_from(PermissionHistory)
    .join(
        latest_active_subq,
        (
            (PermissionHistory.target_id == latest_active_subq.c.target_id)
            & (
                PermissionHistory.permission_type
                == latest_active_subq.c.permission_type
            )
            & (
                PermissionHistory.start_date
                == latest_active_subq.c.max_start_date
            )
        ),
    )
    .join(Thing, Thing.id == PermissionHistory.target_id)

@TylerAdamMartinez TylerAdamMartinez deleted the transfer-fix-permission-history branch February 26, 2026 18:25